Goto

Collaborating Authors

 query optimizer


ARCADE: A Real-Time Data System for Hybrid and Continuous Query Processing across Diverse Data Modalities

Yang, Jingyi, Mo, Songsong, Shi, Jiachen, Yu, Zihao, Shi, Kunhao, Ding, Xuchen, Cong, Gao

arXiv.org Artificial Intelligence

The explosive growth of multimodal data - spanning text, image, video, spatial, and relational modalities, coupled with the need for real-time semantic search and retrieval over these data - has outpaced the capabilities of existing multimodal and real-time database systems, which either lack efficient ingestion and continuous query capability, or fall short in supporting expressive hybrid analytics. We introduce ARCADE, a real-time data system that efficiently supports high-throughput ingestion and expressive hybrid and continuous query processing across diverse data types. ARCADE introduces unified disk-based secondary index on LSM-based storage for vector, spatial, and text data modalities, a comprehensive cost-based query optimizer for hybrid queries, and an incremental materialized view framework for efficient continuous queries. Built on open-source RocksDB storage and MySQL query engine, ARCADE outperforms leading multimodal data systems by up to 7.4x on read-heavy and 1.4x on write-heavy workloads.


LLM4Hint: Leveraging Large Language Models for Hint Recommendation in Offline Query Optimization

Liu, Suchen, Gao, Jun, Han, Yinjun, Lin, Yang

arXiv.org Artificial Intelligence

Query optimization is essential for efficient SQL query execution in DBMS, and remains attractive over time due to the growth of data volumes and advances in hardware. Existing traditional optimizers struggle with the cumbersome hand-tuning required for complex workloads, and the learning-based methods face limitations in ensuring generalization. With the great success of Large Language Model (LLM) across diverse downstream tasks, this paper explores how LLMs can be incorporated to enhance the generalization of learned optimizers. Though promising, such an incorporation still presents challenges, mainly including high model inference latency, and the substantial fine-tuning cost and suboptimal performance due to inherent discrepancy between the token sequences in LLM and structured SQL execution plans with rich numerical features. In this paper, we focus on recurring queries in offline optimization to alleviate the issue of high inference latency, and propose \textbf{LLM4Hint} that leverages moderate-sized backbone LLMs to recommend query optimization hints. LLM4Hint achieves the goals through: (i) integrating a lightweight model to produce a soft prompt, which captures the data distribution in DBMS and the SQL predicates to provide sufficient optimization features while simultaneously reducing the context length fed to the LLM, (ii) devising a query rewriting strategy using a larger commercial LLM, so as to simplify SQL semantics for the backbone LLM and reduce fine-tuning costs, and (iii) introducing an explicit matching prompt to facilitate alignment between the LLM and the lightweight model, which can accelerate convergence of the combined model. Experiments show that LLM4Hint, by leveraging the LLM's stronger capability to understand the query statement, can outperform the state-of-the-art learned optimizers in terms of both effectiveness and generalization.


AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference

Deng, Yangshen, You, Zhengxin, Xiang, Long, Li, Qilong, Yuan, Peiqi, Hong, Zhaoyang, Zheng, Yitao, Li, Wanting, Li, Runzhong, Liu, Haotian, Mouratidis, Kyriakos, Yiu, Man Lung, Li, Huan, Shen, Qiaomu, Mao, Rui, Tang, Bo

arXiv.org Artificial Intelligence

AlayaDB is a cutting-edge vector database system natively architected for efficient and effective long-context inference for Large Language Models (LLMs) at AlayaDB AI. Specifically, it decouples the KV cache and attention computation from the LLM inference systems, and encapsulates them into a novel vector database system. For the Model as a Service providers (MaaS), AlayaDB consumes fewer hardware resources and offers higher generation quality for various workloads with different kinds of Service Level Objectives (SLOs), when comparing with the existing alternative solutions (e.g., KV cache disaggregation, retrieval-based sparse attention). The crux of AlayaDB is that it abstracts the attention computation and cache management for LLM inference into a query processing procedure, and optimizes the performance via a native query optimizer. In this work, we demonstrate the effectiveness of AlayaDB via (i) three use cases from our industry partners, and (ii) extensive experimental results on LLM inference benchmarks.


Neuro-Symbolic Query Optimization in Knowledge Graphs

Acosta, Maribel, Qin, Chang, Schwabe, Tim

arXiv.org Artificial Intelligence

This chapter delves into the emerging field of neuro-symbolic query optimization for knowledge graphs (KGs), presenting a comprehensive exploration of how neural and symbolic techniques can be integrated to enhance query processing. Traditional query optimizers in knowledge graphs rely heavily on symbolic methods, utilizing dataset summaries, statistics, and cost models to select efficient execution plans. However, these approaches often suffer from misestimations and inaccuracies, particularly when dealing with complex queries or large-scale datasets. Recent advancements have introduced neural models, which capture non-linear aspects of query optimization, offering promising alternatives to purely symbolic methods. In this chapter, we introduce neuro-symbolic query optimizers, a novel approach that combines the strengths of symbolic reasoning with the adaptability of neural computation. We discuss the architecture of these hybrid systems, highlighting the interplay between neural and symbolic components to improve the optimizer's ability to navigate the search space and produce efficient execution plans. Additionally, the chapter reviews existing neural components tailored for optimizing queries over knowledge graphs and examines the limitations and challenges in deploying neuro-symbolic query optimizers in real-world environments.


Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models

Cong, Youan, Wang, Cheng, Akash, Pritom Saha, Chang, Kevin Chen-Chuan

arXiv.org Artificial Intelligence

We introduce the Extract-Refine-Retrieve-Read (ERRR) framework, a novel approach designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems through query optimization tailored to meet the specific knowledge requirements of Large Language Models (LLMs). Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting parametric knowledge from LLMs, followed by using a specialized query optimizer for refining these queries. This process ensures the retrieval of only the most pertinent information essential for generating accurate responses. Moreover, to enhance flexibility and reduce computational costs, we propose a trainable scheme for our pipeline that utilizes a smaller, tunable model as the query optimizer, which is refined through knowledge distillation from a larger teacher model. Our evaluations on various question-answering (QA) datasets and with different retrieval systems show that ERRR consistently outperforms existing baselines, proving to be a versatile and cost-effective module for improving the utility and accuracy of RAG systems.


GenJoin: Conditional Generative Plan-to-Plan Query Optimizer that Learns from Subplan Hints

Sulimov, Pavel, Lehmann, Claude, Stockinger, Kurt

arXiv.org Artificial Intelligence

Query optimization has become a research area where classical algorithms are being challenged by machine learning algorithms. At the same time, recent trends in learned query optimizers have shown that it is prudent to take advantage of decades of database research and augment classical query optimizers by shrinking the plan search space through different types of hints (e.g. by specifying the join type, scan type or the order of joins) rather than completely replacing the classical query optimizer with machine learning models. It is especially relevant for cases when classical optimizers cannot fully enumerate all logical and physical plans and, as an alternative, need to rely on less robust approaches like genetic algorithms. However, even symbiotically learned query optimizers are hampered by the need for vast amounts of training data, slow plan generation during inference and unstable results across various workload conditions. In this paper, we present GenJoin - a novel learned query optimizer that considers the query optimization problem as a generative task and is capable of learning from a random set of subplan hints to produce query plans that outperform the classical optimizer. GenJoin is the first learned query optimizer that significantly and consistently outperforms PostgreSQL as well as state-of-the-art methods on two well-known real-world benchmarks across a variety of workloads using rigorous machine learning evaluations.


The Unreasonable Effectiveness of LLMs for Query Optimization

Akioyamen, Peter, Yi, Zixuan, Marcus, Ryan

arXiv.org Artificial Intelligence

Recent work in database query optimization has used complex machine learning strategies, such as customized reinforcement learning schemes. Surprisingly, we show that LLM embeddings of query text contain useful semantic information for query optimization. Specifically, we show that a simple binary classifier deciding between alternative query plans, trained only on a small number of labeled embedded query vectors, can outperform existing heuristic systems. Although we only present some preliminary results, an LLM-powered query optimizer could provide significant benefits, both in terms of performance and simplicity.


NeurDB: An AI-powered Autonomous Data System

Ooi, Beng Chin, Cai, Shaofeng, Chen, Gang, Shen, Yanyan, Tan, Kian-Lee, Wu, Yuncheng, Xiao, Xiaokui, Xing, Naili, Yue, Cong, Zeng, Lingze, Zhang, Meihui, Zhao, Zhanhao

arXiv.org Artificial Intelligence

In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. The imminent fusion of AI and DB (AIxDB) promises a new generation of data systems, which will relieve the burden on end-users across all industry sectors by featuring AI-enhanced functionalities, such as personalized and automated in-database AI-powered analytics, self-driving capabilities for improved system performance, etc. In this paper, we explore the evolution of data systems with a focus on deepening the fusion of AI and DB. We present NeurDB, an AI-powered autonomous data system designed to fully embrace AI design in each major system component and provide in-database AI-powered analytics. We outline the conceptual and architectural overview of NeurDB, discuss its design choices and key components, and report its current development and future plan.


A Novel Technique for Query Plan Representation Based on Graph Neural Nets

Chang, Baoming, Kamali, Amin, Kantere, Verena

arXiv.org Artificial Intelligence

Learning representations for query plans play a pivotal role in machine learning-based query optimizers of database management systems. To this end, particular model architectures are proposed in the literature to transform the tree-structured query plans into representations with formats learnable by downstream machine learning models. However, existing research rarely compares and analyzes the query plan representation capabilities of these tree models and their direct impact on the performance of the overall optimizer. To address this problem, we perform a comparative study to explore the effect of using different state-of-the-art tree models on the optimizer's cost estimation and plan selection performance in relatively complex workloads. Additionally, we explore the possibility of using graph neural networks (GNNs) in the query plan representation task. We propose a novel tree model BiGG employing Bidirectional GNN aggregated by Gated recurrent units (GRUs) and demonstrate experimentally that BiGG provides significant improvements to cost estimation tasks and relatively excellent plan selection performance compared to the state-of-the-art tree models.


Budget-aware Query Tuning: An AutoML Perspective

Wu, Wentao, Wang, Chi

arXiv.org Artificial Intelligence

Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. Such query optimizers rely on cost models to estimate the costs of candidate query execution plans. A cost model represents a function from a set of cost units to query execution cost, where each cost unit specifies the unit cost of executing a certain type of query processing operation (such as table scan or join). These cost units are traditionally viewed as constants, whose values only depend on the platform configuration where the database system runs on top of but are invariant for queries processed by the database system. In this paper, we challenge this classic view by thinking of these cost units as variables instead. We show that, by varying the cost-unit values one can obtain query plans that significantly outperform the default query plans returned by the query optimizer when viewing the cost units as constants. We term this cost-unit tuning process "query tuning" (QT) and show that it is similar to the well-known hyper-parameter optimization (HPO) problem in AutoML. As a result, any state-of-the-art HPO technologies can be applied to QT. We study the QT problem in the context of anytime tuning, which is desirable in practice by constraining the total time spent on QT within a given budget -- we call this problem budget-aware query tuning. We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims for minimizing the execution time of the entire workload. WT is more challenging as one needs to further prioritize individual query tuning within the given time budget. We propose solutions to both QT and WT and experimental evaluation using both benchmark and real workloads demonstrates the efficacy of our proposed solutions.